Direct Importance Estimation with Model Selection and Its Application to Covariate Shift Adaptation
Abstract
When training and test samples follow different input distributions (a situation called covariate shift), the maximum likelihood estimator is known to lose its consistency. To regain consistency, the log-likelihood terms need to be weighted according to the importance (i.e., the ratio of test and training input densities). Accurately estimating the importance is therefore one of the key tasks in covariate shift adaptation. A naive approach is to first estimate the training and test input densities and then estimate the importance as the ratio of the density estimates. However, since density estimation is a hard problem, this approach tends to perform poorly, especially in high-dimensional cases. In this paper, we propose a direct importance estimation method that does not require the input density estimates. Our method is equipped with a natural model selection procedure, so tuning parameters such as the kernel width can be objectively optimized. This is an advantage over a recently developed method of direct importance estimation. Simulations illustrate the usefulness of our approach.
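To make the idea of direct importance estimation concrete, the following is a minimal sketch, not the paper's reference implementation. It assumes a non-negative linear combination of Gaussian kernels centered on test points as the importance model, fits it by projected gradient ascent on the mean log-importance over test samples (subject to the importance averaging to one over training samples), and picks the kernel width by cross-validation over the test samples. The function names, step size, iteration count, and toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_kernel(x, centers, sigma):
    # Gaussian kernel values between every row of x and every center.
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_importance(x_train, x_test, sigma, n_centers=50, n_iter=300, lr=1e-2, seed=0):
    # Hypothetical helper: fits w(x) = sum_l alpha_l * K(x, c_l) without
    # estimating either input density.
    rng = np.random.default_rng(seed)
    n_c = min(n_centers, len(x_test))
    centers = x_test[rng.choice(len(x_test), size=n_c, replace=False)]
    A = gaussian_kernel(x_test, centers, sigma)                 # test design matrix
    b = gaussian_kernel(x_train, centers, sigma).mean(axis=0)   # training-side means
    eps = 1e-12
    alpha = np.ones(n_c)
    alpha /= max(b @ alpha, eps)
    for _ in range(n_iter):
        # Ascent step on the mean log-importance over test samples.
        grad = (A / (A @ alpha + eps)[:, None]).mean(axis=0)
        alpha = alpha + lr * grad
        # Move back toward the constraints: mean training importance = 1, alpha >= 0.
        alpha = alpha + (1.0 - b @ alpha) * b / max(b @ b, eps)
        alpha = np.maximum(alpha, 0.0)
        alpha = alpha / max(b @ alpha, eps)
    return centers, alpha

def importance(x, centers, alpha, sigma):
    # Evaluate the fitted importance model at the rows of x.
    return gaussian_kernel(x, centers, sigma) @ alpha

def select_sigma(x_train, x_test, sigmas, n_folds=5, seed=0):
    # Choose the kernel width that maximizes held-out mean log-importance
    # on test samples (an assumed cross-validation scheme).
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x_test)), n_folds)
    best_sigma, best_score = None, -np.inf
    for sigma in sigmas:
        scores = []
        for k in range(n_folds):
            rest = np.concatenate([folds[j] for j in range(n_folds) if j != k])
            centers, alpha = fit_importance(x_train, x_test[rest], sigma)
            w_held = importance(x_test[folds[k]], centers, alpha, sigma)
            scores.append(np.mean(np.log(w_held + 1e-12)))
        if np.mean(scores) > best_score:
            best_sigma, best_score = sigma, np.mean(scores)
    return best_sigma

# Toy covariate shift: training inputs ~ N(0, 1), test inputs ~ N(1, 0.5^2).
rng = np.random.default_rng(0)
x_train = rng.normal(0.0, 1.0, size=(300, 1))
x_test = rng.normal(1.0, 0.5, size=(200, 1))
sigma = select_sigma(x_train, x_test, sigmas=[0.2, 0.5, 1.0])
centers, alpha = fit_importance(x_train, x_test, sigma)
w_train = importance(x_train, centers, alpha, sigma)
```

The resulting `w_train` values could then be used as per-sample weights on the training log-likelihood terms, which is the role the importance plays in the weighting scheme described above.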
Similar resources
Covariate Shift Adaptation by Importance Weighted Cross Validation
A common assumption in supervised learning is that the input points in the training set follow the same probability distribution as the input points that will be given in the future test phase. However, this assumption is not satisfied, for example, when the outside of the training region is extrapolated. The situation where the training input points and test input points follow different distr...
Stochastic Density Ratio Estimation and Its Application to Feature Selection
In this work, we deal with a relatively new statistical tool in machine learning: the estimation of the ratio of two probability densities, or density ratio estimation for short. As a side piece of research that gained its own traction, we also tackle the task of parameter selection in learning algorithms based on kernel methods. 1 Density Ratio Estimation The estimation of the ratio of two pro...
Continuous Target Shift Adaptation in Supervised Learning
Supervised learning in machine learning concerns inferring an underlying relation between covariate x and target y based on training covariate-target data. It is traditionally assumed that training data and test data, on which the generalization performance of a learning algorithm is measured, follow the same probability distribution. However, this standard assumption is often violated in many ...
Learning under Non-Stationarity: Covariate Shift and Class-Balance Change
One of the fundamental assumptions behind many supervised machine learning algorithms is that training and test data follow the same probability distribution. However, this important assumption is often violated in practice, for example, because of an unavoidable sample selection bias or non-stationarity of the environment. Due to violation of the assumption, standard machine learning methods s...
Direct Density Ratio Estimation for Large-scale Covariate Shift Adaptation
Covariate shift is a situation in supervised learning where training and test inputs follow different distributions even though the functional relation remains unchanged. A common approach to compensating for the bias caused by covariate shift is to reweight the training samples according to importance, which is the ratio of test and training densities. We propose a novel method that allows us ...
Publication date: 2007